@matyascimbulka
Contributor

@matyascimbulka matyascimbulka commented Sep 8, 2025

This PR brings improvements to Apify components directly from Apify developers.

New Changes

  • Common
    • Migrated from Axios HTTP client to Apify client
  • Run Actor
    • Allow users to choose from either recently run Actors or store Actors
    • Display only tagged builds
    • Use prefill values from input schema as default
  • Scrape single URL
    • Return the single result only after the run has finished
  • Get dataset item
    • Add offset as input parameter

Summary by CodeRabbit

  • New Features

    • Actor runs: choose actor source (recently-used/store), select builds by tag, set max items and total charge; improved sync/async flows with output retrieval.
    • Scrape Single URL: now runs via Web Content Crawler, choose crawler engine; URL required.
    • Key-Value Store: accept any value type with automatic JSON/text inference.
  • Improvements

    • Dataset/task listing: added limit and offset; run-task returns run metadata plus dataset items and clearer failure messages.
    • Webhook event options and actor/build selection improved.
  • Refactor

    • Migrated API client to ApifyClient; unified labeling and pagination props.
  • Chores

    • Dependency updates and package version bumps.

matyascimbulka and others added 17 commits July 23, 2025 13:54
… and change how Actor or task name is displayed
… finished (#3)

* fix(scrape-single-url): return the only dataset item after the run is finished

* fix(scrape-single-url): version up

* fix(scrape-single-url): version up

* fix(scrape-single-url): version up

* fix(scrape-single-url): introduce a job status constant, expand a list of terminal statuses to stop the loop

* fix(scrape-single-url): import constants from package, decrease delay in between calls
* feat(apify): Replace Axios with Apify client

* fix(general): adding custom headers to client()=> preserve whole config to be passed to Axios later

* feat(general): add linter script

* fix(apify-get-dataset-items): a function for getting items and parsing of a result

* fix(apify-run-actor): working sync and async, dynamic input schema injection, KVS output retrieval tested only string

* fix(general): change maxResults for limit as an input field

* fix(run-task-sync): move items retrieval to the component, add waitSecs determined by input or plan to prevent blunt timeout error, have the item retrieval logic be connected to run status, clean return value

* fix(apify-scrape-single-url): incorporate timeouts, rework the whole API interaction logic

* fix(apify-set-key-value-store-record): detection of content type, fixed API interaction

* fix(apify-scrape-single-url): remove waiting timeout, return only dataset item, remove extra input fields connected to WCC run

* fix(apify-run-actor): success message

* fix(apify-run-task-synchronously): remove waiting for run to finish timeout

* fix(app): remove paidPlan input filed config

---------

Co-authored-by: Matyas Cimbulka <[email protected]>
This reverts commit 6040822 which for some reason bumped version of the wrong components.
@adolfo-pd adolfo-pd added the User submitted Submitted by a user label Sep 8, 2025
@vercel

vercel bot commented Sep 8, 2025

The latest updates on your projects. Learn more about Vercel for GitHub.

2 Skipped Deployments
Project Deployment Preview Comments Updated (UTC)
pipedream-docs Ignored Ignored Sep 23, 2025 8:08am
pipedream-docs-redirect-do-not-edit Ignored Ignored Sep 23, 2025 8:08am

@pipedream-component-development
Collaborator

Thank you so much for submitting this! We've added it to our backlog to review, and our team has been notified.

@pipedream-component-development
Collaborator

Thanks for submitting this PR! When we review PRs, we follow the Pipedream component guidelines. If you're not familiar, here's a quick checklist:

@coderabbitai
Contributor

coderabbitai bot commented Sep 8, 2025

Walkthrough

Migrates Apify integration to ApifyClient, renames several public props (maxResults→limit, flatten→offset, buildId→buildTag, ACTOR_ID→WCC_ACTOR_ID), reworks run/task/scrape/KV flows (schema-driven actor runs, async vs sync returns, KV type inference), updates webhook handling, and bumps multiple versions.

Changes

Cohort / File(s) Change Summary
Core client, API wrappers & constants
components/apify/apify.app.mjs, components/apify/common/constants.mjs, components/apify/package.json
Replace ad-hoc HTTP layer with ApifyClient and a _client() wrapper; add helpers (getActor, getActorRun, getBuild, listDatasetItems, setKeyValueStoreRecord, formatActorOrTaskLabel); rename ACTOR_ID→WCC_ACTOR_ID, remove EVENT_TYPES; rename props (maxResults→limit), add offset; update method signatures; add apify-client dependency; remove got-scraping.
Run Actor action
components/apify/actions/run-actor/run-actor.mjs
Major rewrite: actor validation via getActor, schema resolution via getSchema(actorId, buildTag), dynamic UI props (actorSource, buildTag, maxItems, maxTotalChargeUsd), prepareData/options mapping, webhook event types from WEBHOOK_EVENT_TYPES, separate async vs sync run flows (async returns run; sync returns { run, output }), version bump.
Run Task Synchronously action
components/apify/actions/run-task-synchronously/run-task-synchronously.mjs
Renamed maxResults→limit; validate run status with ACTOR_JOB_STATUSES; remove dataset options from run call and instead call listDatasetItems with dataset params (clean/fields/omit/flatten/limit); return enriched run metadata plus items; version bump.
Get Dataset Items action
components/apify/actions/get-dataset-items/get-dataset-items.mjs
Version bump; public props renamed flatten→offset, maxResults→limit; listDatasetItems now destructures { items }; use this.offset/this.limit and trim results to limit.
Scrape Single URL action
components/apify/actions/scrape-single-url/scrape-single-url.mjs
Replace direct scraping with running Web Content Crawler actor (WCC_ACTOR_ID); add crawlerType prop; make url required; validate run status using ACTOR_JOB_STATUSES; fetch dataset items and return first item; update summary and version.
Key-Value Store record action
components/apify/actions/set-key-value-store-record/set-key-value-store-record.mjs
value now any; add inferFromValue and looksLikeJson methods to infer data/contentType/mode; require key/keyValueStoreId; call setKeyValueStoreRecord({ storeId, key, value, contentType }); remove parseObject; return structured result and updated summary; version bump.
Sources & webhook handling
components/apify/sources/common/base.mjs, components/apify/sources/new-finished-actor-run-instant/..., components/apify/sources/new-finished-task-run-instant/...
Flatten webhook create payload to top-level fields, use WEBHOOK_EVENT_TYPE_GROUPS.ACTOR_RUN_TERMINAL, read response id at top level, update actorId propDefinition key, bump source versions.

Sequence Diagram(s)

sequenceDiagram
  autonumber
  participant U as User
  participant Action as Run Actor Action
  participant App as Apify App Wrapper
  participant API as Apify API

  U->>Action: configure actorId, buildTag, input, options
  Action->>App: getActor(actorId)
  App->>API: GET /v2/acts/{actorId}
  API-->>App: actor details (taggedBuilds)
  App-->>Action: actor details
  Action->>Action: getSchema(actorId, buildTag)
  Action->>App: getBuild(actorId, buildTag)
  App->>API: GET /v2/acts/{actorId}/builds/{tag}
  API-->>App: build info
  App-->>Action: build info
  Action->>App: runActor({ actorId, input, options })
  App->>API: POST /v2/acts/{actorId}/runs
  API-->>App: run (sync or async)
  alt Async
    App-->>Action: run object
    Action-->>U: return run
  else Sync
    App-->>Action: run finished
    Action->>App: getActorRun(runId) + fetch OUTPUT KVS
    App->>API: GET run + GET KVS record
    API-->>App: run + OUTPUT
    App-->>Action: { run, output }
    Action-->>U: return { run, output }
  end
sequenceDiagram
  autonumber
  participant U as User
  participant Action as Scrape Single URL
  participant App as Apify App Wrapper
  participant API as Apify API
  note over Action: Uses WCC_ACTOR_ID

  U->>Action: url, crawlerType
  Action->>App: runActor({ actorId: WCC, input })
  App->>API: POST /acts/{WCC}/runs
  API-->>App: run (status, defaultDatasetId, consoleUrl)
  App-->>Action: run
  alt status != SUCCEEDED
    Action-->>U: throw Error (includes consoleUrl)
  else
    Action->>App: listDatasetItems({ datasetId, params: { limit:1 } })
    App->>API: GET /datasets/{id}/items
    API-->>App: { items }
    App-->>Action: items
    Action-->>U: return items[0]
  end
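
The Scrape Single URL flow in the diagram above can be sketched roughly as follows. This is an illustration only: `app` is a stand-in object whose `runActor` and `listDatasetItems` method names are taken from the diagram, the `"WCC"` actor ID and the input shape are placeholders, and `stubApp` replaces real Apify calls with in-memory data.

```javascript
// Sketch of the flow: run the actor, fail on any non-SUCCEEDED status,
// otherwise return the first dataset item.
// `app` is a hypothetical wrapper; only its method names come from the diagram.
async function scrapeSingleUrl(app, { url, crawlerType }) {
  const run = await app.runActor({
    actorId: "WCC", // placeholder for WCC_ACTOR_ID
    input: { url, crawlerType }, // illustrative input shape, not the actor's real schema
  });

  if (run.status !== "SUCCEEDED") {
    throw new Error(`Run ended with status ${run.status}. Inspect it at ${run.consoleUrl}`);
  }

  const { items } = await app.listDatasetItems({
    datasetId: run.defaultDatasetId,
    params: { limit: 1 },
  });
  return items[0];
}

// In-memory stub standing in for the Apify API (for illustration only).
const stubApp = {
  async runActor() {
    return {
      status: "SUCCEEDED",
      defaultDatasetId: "ds1",
      consoleUrl: "https://example.com/run",
    };
  },
  async listDatasetItems({ params }) {
    const all = [{ text: "hello" }, { text: "ignored" }];
    return { items: all.slice(0, params.limit) };
  },
};
```

Calling `scrapeSingleUrl(stubApp, { url: "https://example.com", crawlerType: "cheerio" })` resolves to the first stubbed item; swapping in a stub whose `runActor` reports `FAILED` makes the call reject with an error that carries the console URL.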
sequenceDiagram
  autonumber
  participant U as User
  participant Action as Run Task Synchronously
  participant App as Apify App Wrapper
  participant API as Apify API

  U->>Action: taskId, dataset params (fields/omit/flatten/limit)
  Action->>App: runTaskSynchronously({ taskId })
  App->>API: POST /actor-tasks/{taskId}/runs/sync
  API-->>App: run result (status, defaultDatasetId, consoleUrl)
  App-->>Action: run result
  alt status != SUCCEEDED
    Action-->>U: throw Error (includes consoleUrl)
  else
    Action->>App: listDatasetItems({ datasetId, params })
    App->>API: GET /datasets/{id}/items
    API-->>App: { items }
    App-->>Action: items
    Action-->>U: return { run metadata, items }
  end
sequenceDiagram
  autonumber
  participant Action as Set KV Store Record
  participant App as Apify App Wrapper
  participant API as Apify API

  Action->>Action: inferFromValue(value) -> { data, contentType, mode }
  Action->>App: setKeyValueStoreRecord({ storeId, key, value: data, contentType })
  App->>API: PUT /key-value-stores/{id}/records/{key} (contentType)
  API-->>App: response
  App-->>Action: response
  Action-->>Action: build summary + structured return

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60–90 minutes

Poem

I nibbled through the client tree,
Swapped crumbs for ApifyClient tea.
Builds now wear a proper tag,
Datasets fetched with offset bag.
Keys confess if JSON’s true — a rabbit hops, commits anew. 🐇

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Description Check ⚠️ Warning The repository requires the PR description to follow the provided template which includes a "## WHY" section, but the submitted description omits that required heading and does not explain motivation, user impact, migration steps, or breaking changes; it only lists high-level changes without the contextual rationale the template asks for. Please update the PR description to follow the repository template by adding a "## WHY" section that explains the motivation and user impact, call out breaking API or prop renames (e.g., flatten→offset, maxResults→limit, run return-shape changes), include any required migration or upgrade steps and testing performed, and ensure reviewers can quickly see the scope and rationale of the changes.
✅ Passed checks (2 passed)
Check name Status Explanation
Title Check ✅ Passed The title "[APP] Apify - Improvements to various components" accurately signals the PR is an Apify-related batch of improvements and matches the multi-file changes (client migration, actor/run/dataset updates) in the diff, but it is fairly broad and does not call out the primary technical change such as the migration to ApifyClient and notable public API renames.
Docstring Coverage ✅ Passed No functions found in the changes. Docstring coverage check skipped.
@matyascimbulka matyascimbulka reopened this Sep 8, 2025
Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 11

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (5)
components/box/actions/download-file/download-file.mjs (1)

1-5: Sanitize fileName to prevent path traversal and ensure safe writes to /tmp.
Using /tmp/${this.fileName} allows ../ segments and could write outside /tmp. Use path.basename and path.join.

Apply:

 import app from "../../box.app.mjs";
 import fs from "fs";
 import stream from "stream";
 import { promisify } from "util";
+import path from "path";
@@
-    const filePath = `/tmp/${this.fileName}`;
+    const safeName = path.basename(this.fileName);
+    const filePath = path.join("/tmp", safeName);

Also applies to: 52-56
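
As a standalone illustration of the suggested fix, the sketch below shows how `path.basename` strips directory segments before the name is joined onto `/tmp`. The helper name `safeTmpPath` and the extra guard for `.` and `..` are assumptions added here, not part of the component; the example assumes a POSIX system and an ES module context.

```javascript
import path from "node:path";

// Build a path under /tmp from an untrusted file name.
// path.basename() drops directory components, including "../" segments.
function safeTmpPath(fileName) {
  const safeName = path.basename(String(fileName));
  // basename("..") is still "..", so reject names that are not real file names.
  if (!safeName || safeName === "." || safeName === "..") {
    throw new Error(`Invalid file name: ${fileName}`);
  }
  return path.join("/tmp", safeName);
}

console.log(safeTmpPath("report.pdf"));
console.log(safeTmpPath("../../etc/passwd"));
```

On a POSIX system both calls produce a path directly under `/tmp`; the traversal segments in the second name are discarded rather than resolved.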

components/box/actions/get-comments/get-comments.mjs (1)

34-41: Likely runtime bug: missing await before getResourcesStream.
Otherwise, for await (...) iterates over a Promise, which is not an async iterable.

-    const resourcesStream = utils.getResourcesStream({
+    const resourcesStream = await utils.getResourcesStream({
       resourceFn: this.app.getComments,
       resourceFnArgs: {
         $,
         fileId: this.fileId,
       },
     });
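
To illustrate the failure mode, here is a small sketch that assumes `getResourcesStream` is an async function resolving to an async iterable (the review comment implies this, but the excerpt does not show the helper). All names below are hypothetical stand-ins, not the Box component's code.

```javascript
// Hypothetical stand-ins; not the actual Box component helpers.
async function* resources() {
  yield "a";
  yield "b";
}

// An async function that resolves to an async iterable.
async function getResourcesStream() {
  return resources();
}

// Correct: await the promise first, then iterate the resolved stream.
async function collectWithAwait() {
  const out = [];
  for await (const item of await getResourcesStream()) {
    out.push(item);
  }
  return out;
}

// Incorrect: a Promise is neither async iterable nor iterable,
// so the loop throws a TypeError before the first iteration.
async function collectWithoutAwait() {
  const out = [];
  for await (const item of getResourcesStream()) {
    out.push(item);
  }
  return out;
}
```

If the helper were instead an async generator function (declared with `async function*`), calling it would return an async iterable directly and no `await` would be needed, so the fix depends on how the helper is actually defined.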
components/apify/sources/common/base.mjs (1)

38-39: Bug: typo in fallback ID uses createAt instead of createdAt

This breaks dedupe IDs for some events.

-      id: body.eventData.actorRunId || `${body.userId}-${body.createAt}`,
+      id: body.eventData.actorRunId || `${body.userId}-${body.createdAt}`,
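
A quick sketch of why the typo matters: when `actorRunId` is absent, the misspelled property reads as `undefined`, so every fallback ID for a given user collapses to the same string. The payload below is a made-up example, and `buildEventId` and `buildEventIdWithTypo` are hypothetical names used only for this comparison.

```javascript
// Fallback ID as suggested in the fix: run ID first, else user ID + timestamp.
function buildEventId(body) {
  return body.eventData.actorRunId || `${body.userId}-${body.createdAt}`;
}

// The same expression with the original typo (createAt).
function buildEventIdWithTypo(body) {
  return body.eventData.actorRunId || `${body.userId}-${body.createAt}`;
}

// Made-up payload without an actorRunId, to exercise the fallback branch.
const body = {
  userId: "u1",
  createdAt: "2025-09-08T00:00:00.000Z",
  eventData: {},
};

console.log(buildEventId(body));         // u1-2025-09-08T00:00:00.000Z
console.log(buildEventIdWithTypo(body)); // u1-undefined
```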
components/apify/actions/scrape-single-url/scrape-single-url.mjs (1)

8-9: Update description: output is JSON item, not raw HTML

Doc currently promises HTML but the action returns the first dataset item.

-  description: "Executes a scraper on a specific website and returns its content as HTML. This action is perfect for extracting content from a single page. [See the documentation](https://docs.apify.com/sdk/js/docs/examples/crawl-single-url)",
+  description: "Runs Apify's Web Content Crawler (WCC) on a single URL and returns the first extracted result (JSON) after the run finishes. Ideal for scraping one page end-to-end.",
components/apify/actions/run-task-synchronously/run-task-synchronously.mjs (1)

56-61: Potential type error: flatten is joined like an array.

On Apify’s dataset API, flatten is a boolean. Calling .join() will throw if flatten is boolean. Align with how get-dataset-items dropped flatten or pass it as boolean.

Apply this diff in the dataset fetch params (see lines 95–104) to stop joining:

-        flatten: this.flatten && this.flatten.join(),
+        flatten: this.flatten,

Optionally, remove the flatten prop entirely for consistency with get-dataset-items.

🧹 Nitpick comments (14)
components/box/actions/upload-file-version/upload-file-version.mjs (2)

61-70: Remove unused params to reduce confusion.
createdAt and parentId aren’t defined props for version upload; pass only what’s used.

-    const {
-      file, fileId, createdAt, modifiedAt, fileName, parentId,
-    } = this;
+    const {
+      file, fileId, modifiedAt, fileName,
+    } = this;
@@
-    const data = await this.getFileUploadBody({
-      file,
-      createdAt,
-      modifiedAt,
-      fileName,
-      parentId,
-    });
+    const data = await this.getFileUploadBody({
+      file,
+      modifiedAt,
+      fileName,
+    });

75-77: Prefer FormData helpers over private _boundary.
Use getHeaders() or getBoundary() when available.

-      headers: {
-        "Content-Type": `multipart/form-data; boundary=${data._boundary}`,
-      },
+      headers: data.getHeaders?.() ?? {
+        "Content-Type": `multipart/form-data; boundary=${data.getBoundary?.() ?? data._boundary}`,
+      },
components/box/sources/new-file/new-file.mjs (1)

6-6: Polish description grammar.
Minor wording tweak for clarity.

-  description: "Emit new event when a new file uploaded on a target. [See the documentation](https://developer.box.com/reference/post-webhooks)",
+  description: "Emit an event when a new file is uploaded. [See the documentation](https://developer.box.com/reference/post-webhooks)",
components/box/sources/new-folder/new-folder.mjs (1)

6-6: Polish description grammar.
Minor wording tweak for clarity.

-  description: "Emit new event when a new folder created on a target. [See the documentation](https://developer.box.com/reference/post-webhooks)",
+  description: "Emit an event when a new folder is created. [See the documentation](https://developer.box.com/reference/post-webhooks)",
components/box/actions/upload-file/upload-file.mjs (1)

65-69: Prefer FormData helpers over private _boundary.
Align with best practice and future-proof boundary handling.

-      headers: {
-        "Content-Type": `multipart/form-data; boundary=${data._boundary}`,
-      },
+      headers: data.getHeaders?.() ?? {
+        "Content-Type": `multipart/form-data; boundary=${data.getBoundary?.() ?? data._boundary}`,
+      },
components/apify/common/constants.mjs (1)

1-3: Nit: document the hard-coded WCC actor ID

Add a brief comment (actor name, link) so future changes to the WCC actor are easier to track.

-export const WCC_ACTOR_ID = "aYG0l9s7dbB7j3gbS";
+// Web Content Crawler (official Apify actor). Update if the canonical actor changes.
+export const WCC_ACTOR_ID = "aYG0l9s7dbB7j3gbS";
components/apify/actions/scrape-single-url/scrape-single-url.mjs (1)

67-73: Tighten dataset read and handle empty results

Limit to 1 item and fail clearly when nothing is extracted. Also surface the console URL in the summary for quick triage.

-    const { items } = await this.apify.listDatasetItems({
-      datasetId: defaultDatasetId,
-    });
+    const { items } = await this.apify.listDatasetItems({
+      datasetId: defaultDatasetId,
+      limit: 1,
+      offset: 0,
+    });
 
-    $.export("$summary", "Run of Web Content Crawler finished successfully.");
-    return items[0];
+    if (!items?.length) {
+      throw new Error("No result extracted for the provided URL.");
+    }
+    $.export("$summary", `Web Content Crawler run succeeded. Inspect: ${consoleUrl}`);
+    return items[0];
components/apify/actions/get-dataset-items/get-dataset-items.mjs (2)

61-72: Optional: avoid over-fetching when limit is small.

Dynamically reduce page size to the remaining budget.

Apply:

-    do {
-      const { items } = await this.apify.listDatasetItems({
+    do {
+      const remaining = Number.isInteger(this.limit) ? Math.max(this.limit - results.length, 0) : LIMIT;
+      params.limit = Math.min(LIMIT, remaining || LIMIT);
+      const { items } = await this.apify.listDatasetItems({
         datasetId: this.datasetId,
         params,
       });
       results.push(...items);
-      if (results.length >= this.limit) {
+      if (Number.isInteger(this.limit) && results.length >= this.limit) {
         break;
       }
       total = items?.length;
       params.offset += LIMIT;
     } while (total);
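
The idea of shrinking each page to the remaining budget can be sketched independently of the component. In this sketch `PAGE_SIZE` stands in for the `LIMIT` constant in the diff (its real value is not shown here), and an in-memory array replaces `listDatasetItems`; both are assumptions for illustration.

```javascript
const PAGE_SIZE = 1000; // hypothetical stand-in for the LIMIT constant in the diff

// Size of the next page request, or 0 once the requested limit is met.
// A non-integer `limit` (for example undefined) means "fetch everything".
function nextPageSize(limit, fetchedSoFar, pageSize = PAGE_SIZE) {
  if (!Number.isInteger(limit)) return pageSize;
  return Math.min(pageSize, Math.max(limit - fetchedSoFar, 0));
}

// Page through `source`, never requesting more than the remaining budget.
function fetchAll(source, limit, pageSize = PAGE_SIZE) {
  const results = [];
  let offset = 0;
  for (;;) {
    const size = nextPageSize(limit, results.length, pageSize);
    if (size === 0) break;
    const items = source.slice(offset, offset + size); // stands in for listDatasetItems
    if (items.length === 0) break;
    results.push(...items);
    offset += items.length;
  }
  return results;
}
```

Unlike the diff, this version advances the offset by the number of items actually returned and stops as soon as the budget reaches zero, which also removes the need for a separate trim step.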

74-76: Guard trim when limit is unset.

Only trim when limit is a number.

Apply:

-    if (results.length > this.limit) {
-      results.length = this.limit;
-    }
+    if (Number.isInteger(this.limit) && results.length > this.limit) {
+      results.length = this.limit;
+    }
components/apify/actions/set-key-value-store-record/set-key-value-store-record.mjs (2)

12-24: Required store/key: good; also validate non-empty key.

Add a simple runtime check to prevent empty or whitespace-only keys.

Apply:

   async run({ $ }) {
+    if (!this.key || !String(this.key).trim()) {
+      throw new Error("Key is required and cannot be empty.");
+    }

33-76: Robust value inference; consider binary case.

If consumers pass a Buffer or base64 blob, you may want to detect and set contentType: application/octet-stream.

Example enhancement (outside current diff scope):

if (typeof Buffer !== "undefined" && Buffer.isBuffer(input)) {
  return { data: input, contentType: "application/octet-stream", mode: "binary" };
}
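
For context, value inference of this kind might look roughly like the sketch below. The function names `inferFromValue` and `looksLikeJson` come from the walkthrough, but the bodies here are a guess at one plausible approach, not the action's actual implementation; the exact content-type strings are likewise assumptions.

```javascript
// Cheap pre-check before attempting JSON.parse on a string.
function looksLikeJson(text) {
  const t = text.trim();
  return (t.startsWith("{") && t.endsWith("}"))
    || (t.startsWith("[") && t.endsWith("]"));
}

// Decide how a value should be stored: binary, JSON, or plain text.
function inferFromValue(input) {
  if (typeof Buffer !== "undefined" && Buffer.isBuffer(input)) {
    return { data: input, contentType: "application/octet-stream", mode: "binary" };
  }
  if (typeof input === "string") {
    if (looksLikeJson(input)) {
      try {
        return {
          data: JSON.parse(input),
          contentType: "application/json; charset=utf-8",
          mode: "json",
        };
      } catch (err) {
        // Not valid JSON after all; fall through to plain text.
      }
    }
    return { data: input, contentType: "text/plain; charset=utf-8", mode: "text" };
  }
  // Objects, arrays, numbers, booleans, and null are stored as JSON.
  return { data: input, contentType: "application/json; charset=utf-8", mode: "json" };
}
```

Checking for a Buffer first keeps binary data from being treated as a generic object, which is the case the nitpick above is concerned with.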
components/apify/actions/run-actor/run-actor.mjs (1)

187-193: Avoid “undefined ...” in descriptions.

When value.description is missing, concatenations produce “undefined ...”.

Apply:

         props[key] = {
           type: this.getType(value.type),
           label: value.title,
-          description: value.description,
+          description: value.description || "",
           optional: !requiredProps.includes(key),
         };

(With this change, later += appends are safe.)

Also applies to: 200-205, 221-222
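
The underlying JavaScript behavior is easy to reproduce in isolation. In the sketch below, `naiveDescription` and `buildDescription` are hypothetical helpers, and the suffix text is only an example of what a later append might add.

```javascript
// Naive concatenation: a missing description is stringified as "undefined".
function naiveDescription(field, suffix) {
  return field.description + ` ${suffix}`;
}

// Defaulting to an empty string first keeps later appends clean.
function buildDescription(field, suffix) {
  let description = field.description || "";
  if (suffix) {
    description += description ? ` ${suffix}` : suffix;
  }
  return description;
}

console.log(naiveDescription({}, "Default: 5"));                           // undefined Default: 5
console.log(buildDescription({}, "Default: 5"));                           // Default: 5
console.log(buildDescription({ description: "Max pages." }, "Default: 5")); // Max pages. Default: 5
```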

components/apify/apify.app.mjs (2)

134-139: Grammar nit: clarify the offset description.

Apply:

-      description: "The number records to skip before returning results",
+      description: "The number of records to skip before returning results",

142-154: Optional: cache the ApifyClient instance.

Avoid recreating the client on every call.

Example:

-  _client() {
-    return new ApifyClient({
+  _client() {
+    if (!this.__apifyClient) {
+      this.__apifyClient = new ApifyClient({
         token: this.$auth.api_token,
         requestInterceptors: [
           (config) => ({
             ...config,
             headers: {
               ...(config.headers || {}),
               "x-apify-integration-platform": "pipedream",
             },
           }),
         ],
-    });
+      });
+    }
+    return this.__apifyClient;
   },
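
The caching pattern itself can be shown without the Apify dependency. In this sketch `createClient` is a hypothetical stand-in for `new ApifyClient(...)` that counts how often it runs, and `app` is a plain object rather than a real Pipedream app; whether `this` persists between method calls in the actual runtime is an assumption worth verifying before relying on the cache.

```javascript
// Hypothetical stand-in for `new ApifyClient(...)`; counts constructions.
let constructed = 0;
function createClient(token) {
  constructed += 1;
  return { token };
}

const app = {
  $auth: { api_token: "example-token" },
  _client() {
    // Create the client on first use, then reuse the cached instance.
    if (!this.__client) {
      this.__client = createClient(this.$auth.api_token);
    }
    return this.__client;
  },
};

const first = app._client();
const second = app._client();
console.log(first === second, constructed);
```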
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 1725ae6 and 25691bf.

⛔ Files ignored due to path filters (1)
  • pnpm-lock.yaml is excluded by !**/pnpm-lock.yaml
📒 Files selected for processing (19)
  • components/apify/actions/get-dataset-items/get-dataset-items.mjs (2 hunks)
  • components/apify/actions/run-actor/run-actor.mjs (3 hunks)
  • components/apify/actions/run-task-synchronously/run-task-synchronously.mjs (3 hunks)
  • components/apify/actions/scrape-single-url/scrape-single-url.mjs (2 hunks)
  • components/apify/actions/set-key-value-store-record/set-key-value-store-record.mjs (2 hunks)
  • components/apify/apify.app.mjs (6 hunks)
  • components/apify/common/constants.mjs (1 hunks)
  • components/apify/package.json (1 hunks)
  • components/apify/sources/common/base.mjs (2 hunks)
  • components/apify/sources/new-finished-actor-run-instant/new-finished-actor-run-instant.mjs (2 hunks)
  • components/apify/sources/new-finished-task-run-instant/new-finished-task-run-instant.mjs (1 hunks)
  • components/box/actions/download-file/download-file.mjs (1 hunks)
  • components/box/actions/get-comments/get-comments.mjs (1 hunks)
  • components/box/actions/search-content/search-content.mjs (1 hunks)
  • components/box/actions/upload-file-version/upload-file-version.mjs (1 hunks)
  • components/box/actions/upload-file/upload-file.mjs (1 hunks)
  • components/box/sources/new-event/new-event.mjs (1 hunks)
  • components/box/sources/new-file/new-file.mjs (1 hunks)
  • components/box/sources/new-folder/new-folder.mjs (1 hunks)
🔇 Additional comments (25)
components/box/actions/download-file/download-file.mjs (1)

10-10: Version bump only — looks good.
No functional diffs here.

components/apify/sources/new-finished-task-run-instant/new-finished-task-run-instant.mjs (1)

9-9: Version bump only — looks good.
No logic changes.

components/box/actions/upload-file-version/upload-file-version.mjs (1)

8-8: Version bump only — looks good.

components/box/sources/new-file/new-file.mjs (1)

7-7: Version bump only — looks good.

components/box/sources/new-folder/new-folder.mjs (1)

7-7: Version bump only — looks good.

components/box/actions/search-content/search-content.mjs (1)

9-9: Version bump only — looks good.

components/box/actions/upload-file/upload-file.mjs (1)

8-8: Version bump only — looks good.

components/box/actions/get-comments/get-comments.mjs (1)

8-8: Version bump only — looks good.

components/box/sources/new-event/new-event.mjs (1)

8-8: LGTM: version bump only

No functional changes. Safe to merge.

components/apify/sources/new-finished-actor-run-instant/new-finished-actor-run-instant.mjs (1)

15-20: Remove backward compatibility concern for userActorId
No occurrences of userActorId in the repo and this is a newly added source, so renaming to actorId cannot break any existing configs.

Likely an incorrect or invalid review comment.

components/apify/common/constants.mjs (1)

1-3: Refine search to apify constants only
The broad grep is matching every EVENT_TYPES in other components. To confirm removal of legacy ACTOR_ID or EVENT_TYPES from components/apify/common/constants.mjs, restrict the search to imports from that file rather than all occurrences.

Likely an incorrect or invalid review comment.

components/apify/actions/run-task-synchronously/run-task-synchronously.mjs (3)

91-94: Good: explicit non-SUCCEEDED handling with console URL.


106-118: Thoughtful structured return + summary.


2-2: No action needed: @apify/consts is declared in components/apify/package.json (line 16).

components/apify/actions/get-dataset-items/get-dataset-items.mjs (2)

8-8: Version bump looks good.


36-47: Verification script:

#!/bin/bash
set -euo pipefail

echo "Finding apify.app.mjs file location..."
fd --type f apify.app.mjs || echo "apify.app.mjs not found"

echo "Searching for propDefinitions in all .mjs files..."
rg -n "propDefinitions" -C3 --glob '*.mjs' || echo "No propDefinitions found"

echo "Searching for 'offset:' occurrences..."
rg -nP 'offset\s*:' -C3 --glob '*.mjs' || echo "No offset occurrences found"

echo "Searching for 'limit:' occurrences..."
rg -nP 'limit\s*:' -C3 --glob '*.mjs' || echo "No limit occurrences found"

components/apify/actions/set-key-value-store-record/set-key-value-store-record.mjs (4)

6-9: Clearer description + version bump: nice.


93-103: API call + summary formatting look good.


105-112: Structured return payload: good for downstream steps.


26-31: Type any is supported for action props.

Pipedream allows "any" in action prop schemas, so type: "any" for the Value prop is valid and requires no change.

components/apify/actions/run-actor/run-actor.mjs (3)

232-241: Nice UX: Output Record Key only when sync.

Conditionally exposing outputRecordKey for synchronous runs is spot on.


148-152: Enum handling looks good.

Mapping enum + enumTitles to { value, label } options is correct.


339-345: Fix: wrong argument names when calling apify.runActor (sync path).

The app method expects { actorId, input, options }, but { data, params } are passed; input/options are ignored, leading to empty input and default options.

Apply:

       run = await apify.runActor({
         actorId,
-        data: input,
-        params,
+        input,
+        options: params,
       });

Likely an incorrect or invalid review comment.

components/apify/apify.app.mjs (2)

267-274: Label formatting looks solid.

Good, consistent labels for actors and tasks.


217-219: No change required: client.store().list(opts) is the documented method for listing public actors via the /v2/store endpoint; there is no store().actors().list(...) or actors().listPublic(...) in the ApifyClient API.

Likely an incorrect or invalid review comment.

@matyascimbulka
Contributor Author

Hello @michelle0927, thank you for the review. I have modified the versions based on your suggestions.

michelle0927
michelle0927 previously approved these changes Sep 19, 2025
Collaborator

@michelle0927 michelle0927 left a comment

Thanks for making those changes! Ready for QA!

@michelle0927 michelle0927 moved this from Changes Required to Ready for QA in Component (Source and Action) Backlog Sep 19, 2025
@vunguyenhung vunguyenhung moved this from Ready for QA to In QA in Component (Source and Action) Backlog Sep 21, 2025
@vunguyenhung vunguyenhung moved this from In QA to Ready for QA in Component (Source and Action) Backlog Sep 21, 2025
@vunguyenhung vunguyenhung moved this from Ready for QA to In QA in Component (Source and Action) Backlog Sep 21, 2025
@vunguyenhung vunguyenhung moved this from In QA to Changes Required in Component (Source and Action) Backlog Sep 21, 2025
@vunguyenhung
Collaborator

Hello everyone, I have tested this PR, and some test cases failed or need improvement.

Please check the test report below for more information
https://vunguyenhung.notion.site/APP-Apify-Improvements-to-various-components-273bf548bb5e819682dbd3a86b66100a

@matyascimbulka
Contributor Author

@vunguyenhung Thank you for testing the PR. I have fixed the issues that you found.

@vunguyenhung vunguyenhung moved this from Changes Required to Ready for PR Review in Component (Source and Action) Backlog Sep 22, 2025
@michelle0927 michelle0927 moved this from Ready for PR Review to In Review in Component (Source and Action) Backlog Sep 22, 2025
Collaborator

@michelle0927 michelle0927 left a comment

Hi @matyascimbulka, I noticed a couple typos in the run-actor action. Just need to update those, and this will be ready for QA again.

@michelle0927 michelle0927 moved this from In Review to Changes Required in Component (Source and Action) Backlog Sep 22, 2025
Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🧹 Nitpick comments (7)
components/apify/actions/scrape-single-url/scrape-single-url.mjs (3)

68-70: Limit dataset fetch and return clean items.
Avoid fetching more than needed and prefer cleaned items.

Apply this diff:

-    const { items } = await this.apify.listDatasetItems({
-      datasetId: defaultDatasetId,
-    });
+    const { items } = await this.apify.listDatasetItems({
+      datasetId: defaultDatasetId,
+      params: { limit: 1, clean: true },
+    });

72-73: Handle empty dataset explicitly.
Avoid returning undefined when no items are produced.

Apply this diff:

-    $.export("$summary", "Run of Web Content Crawler finished successfully.");
-    return items[0];
+    const item = items && items[0];
+    if (!item) {
+      throw new Error(`No dataset items returned for URL: ${this.url}. Inspect the run: ${consoleUrl}.`);
+    }
+    $.export("$summary", "Run of Web Content Crawler finished successfully.");
+    return item;
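
The empty-dataset guard suggested here can be sketched in isolation as follows. `firstItemOrThrow` is a hypothetical helper name used only for illustration; its `url` and `consoleUrl` arguments mirror the variables in the diff and are assumed to be available inside the action.

```javascript
// Hypothetical helper (not part of the PR): returns the first dataset item,
// or throws a descriptive error instead of silently returning undefined.
function firstItemOrThrow(items, url, consoleUrl) {
  const item = Array.isArray(items) ? items[0] : undefined;
  if (!item) {
    throw new Error(
      `No dataset items returned for URL: ${url}. Inspect the run: ${consoleUrl}.`,
    );
  }
  return item;
}
```

Throwing here surfaces an empty run as a failed step rather than passing `undefined` on to later workflow steps.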

8-8: Update description to reflect current behavior.
The action now returns the first dataset item, not “content as HTML”.

Apply this diff:

-  description: "Executes a scraper on a specific website and returns its content as HTML. This action is perfect for extracting content from a single page. [See the documentation](https://docs.apify.com/sdk/js/docs/examples/crawl-single-url)",
+  description: "Runs the Web Content Crawler on a single URL and returns the first item from its dataset after the run completes. [See the documentation](https://docs.apify.com/sdk/js/docs/examples/crawl-single-url)",
components/apify/actions/run-actor/run-actor.mjs (4)

135-144: Exclude non-schema props from the actor input (e.g., actorSource).

Unknown props can leak into the run input. Filter by schema keys and/or exclude at destructure.

Apply:

       for (const [
         key,
         value,
       ] of Object.entries(data)) {
-        const editor = properties[key]?.editor || "hidden";
+        if (!properties[key]) continue;
+        const editor = properties[key].editor || "hidden";
         newData[key] = Array.isArray(value)
           ? value.map((item) => this.setValue(editor, item))
           : value;
       }

Also remove actorSource from the residual data:

   async run({ $ }) {
     const {
       apify,
       actorId,
+      actorSource,
       buildTag,

Also applies to: 256-268
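
As a rough illustration of the filtering idea, the sketch below keeps only keys that the input schema declares. `pickSchemaProps` is a hypothetical name, and the sample schema is invented for the example; the real action also maps values through `setValue`, which is omitted here.

```javascript
// Hypothetical standalone version of the filtering step (not the PR's code):
// keeps only the keys declared in the input schema's `properties` object.
function pickSchemaProps(data, properties) {
  const result = {};
  for (const [key, value] of Object.entries(data)) {
    // Skip UI-only props such as actorSource that the schema does not know.
    if (!properties[key]) continue;
    result[key] = value;
  }
  return result;
}

// Example: actorSource is dropped because the (made-up) schema lacks it.
const exampleInput = pickSchemaProps(
  { maxItems: 10, actorSource: "store" },
  { maxItems: { editor: "number" } },
);
console.log(exampleInput);
```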


199-201: Prevent “undefined …” in descriptions when schema lacks description.

String concatenation on undefined yields “undefined …”.

Apply:

-          if (value.unit) {
-            props[key].description += ` Unit: ${value.unit}.`;
-          }
+          if (value.unit) {
+            props[key].description = `${props[key].description ?? ""} Unit: ${value.unit}.`;
+          }
-          props[key].description += ` Default: \`${JSON.stringify(defaultValue)}\``;
+          props[key].description = `${props[key].description ?? ""} Default: \`${JSON.stringify(defaultValue)}\``;

Also applies to: 221-221
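
The failure mode is easy to reproduce outside the component. The sketch below contrasts the `+=` pattern with the nullish-coalescing form from the diff; `appendNote` is a hypothetical helper name, not something defined in the PR.

```javascript
// Reproduces the bug: `+=` on an undefined description stringifies `undefined`.
let description;
description += " Unit: seconds.";
console.log(description); // "undefined Unit: seconds."

// Hypothetical helper mirroring the suggested fix: treat a missing
// description as an empty string before appending the note.
function appendNote(existing, note) {
  return `${existing ?? ""}${note}`;
}
```

Note that with a missing description the result starts with a space, exactly as in the suggested diff; trimming is left out so the sketch stays faithful to it.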


146-153: Support enums without titles.

Fallback labels to enum values when enumTitles are absent.

Apply:

   prepareOptions(value) {
-      if (value.enum && value.enumTitles) {
+      if (value.enum && value.enumTitles) {
         return value.enum.map((val, i) => ({
           value: val,
           label: value.enumTitles[i],
         }));
+      } else if (value.enum) {
+        return value.enum.map((val) => ({
+          value: val,
+          label: String(val),
+        }));
       }
     },
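
A compact variant of the same idea, written as a free function so it can run on its own, might look like this. It is a sketch rather than the PR's implementation: it falls back per entry, so a partially filled `enumTitles` array also works, which goes slightly beyond the diff above.

```javascript
// Sketch only: builds { value, label } options from a schema property,
// using enumTitles where present and the enum value itself otherwise.
function prepareOptions(schemaProp) {
  if (!Array.isArray(schemaProp.enum)) return undefined;
  return schemaProp.enum.map((val, i) => ({
    value: val,
    label: schemaProp.enumTitles?.[i] ?? String(val),
  }));
}
```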

4-4: Default webhook event types to terminal events when not provided.

Improves UX and aligns with Apify guidance.

Apply:

-import { WEBHOOK_EVENT_TYPES } from "@apify/consts";
+import { WEBHOOK_EVENT_TYPES, WEBHOOK_EVENT_TYPE_GROUPS } from "@apify/consts";
       ...(webhook && {
         webhooks: [
           {
-            eventTypes,
+            eventTypes: (eventTypes && eventTypes.length)
+              ? eventTypes
+              : WEBHOOK_EVENT_TYPE_GROUPS.ACTOR_RUN_TERMINAL,
             requestUrl: webhook,
           },
         ],
       }),

Also applies to: 317-324
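
The defaulting logic can be sketched without the `@apify/consts` dependency. In the snippet below, `ACTOR_RUN_TERMINAL` is a local stand-in for `WEBHOOK_EVENT_TYPE_GROUPS.ACTOR_RUN_TERMINAL`, with values copied from the learning note quoted later in this review, and `buildWebhooks` is a hypothetical helper name.

```javascript
// Local stand-in for WEBHOOK_EVENT_TYPE_GROUPS.ACTOR_RUN_TERMINAL so the
// sketch runs on its own; real code would import it from "@apify/consts".
const ACTOR_RUN_TERMINAL = [
  "ACTOR.RUN.SUCCEEDED",
  "ACTOR.RUN.FAILED",
  "ACTOR.RUN.ABORTED",
  "ACTOR.RUN.TIMED_OUT",
];

// Hypothetical helper: returns the webhook portion of the run options,
// defaulting to terminal events when no event types were selected.
function buildWebhooks(webhook, eventTypes) {
  if (!webhook) return {};
  return {
    webhooks: [
      {
        eventTypes: eventTypes?.length ? eventTypes : ACTOR_RUN_TERMINAL,
        requestUrl: webhook,
      },
    ],
  };
}
```

The result can be spread into the run options, for example `{ ...buildWebhooks(webhook, eventTypes) }`, which matches the conditional spread used in the diff.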

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 3861c6b and 77473f9.

⛔ Files ignored due to path filters (1)
  • pnpm-lock.yaml is excluded by !**/pnpm-lock.yaml
📒 Files selected for processing (5)
  • components/apify/actions/get-dataset-items/get-dataset-items.mjs (2 hunks)
  • components/apify/actions/run-actor/run-actor.mjs (3 hunks)
  • components/apify/actions/scrape-single-url/scrape-single-url.mjs (1 hunks)
  • components/apify/package.json (2 hunks)
  • components/apify/sources/new-finished-actor-run-instant/new-finished-actor-run-instant.mjs (2 hunks)
🚧 Files skipped from review as they are similar to previous changes (3)
  • components/apify/sources/new-finished-actor-run-instant/new-finished-actor-run-instant.mjs
  • components/apify/package.json
  • components/apify/actions/get-dataset-items/get-dataset-items.mjs
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-09-12T08:28:06.736Z
Learnt from: matyascimbulka
PR: PipedreamHQ/pipedream#18308
File: components/apify/sources/common/base.mjs:17-0
Timestamp: 2025-09-12T08:28:06.736Z
Learning: WEBHOOK_EVENT_TYPE_GROUPS.ACTOR_RUN_TERMINAL from apify/consts is an array containing all terminal Actor run event types: ["ACTOR.RUN.SUCCEEDED", "ACTOR.RUN.FAILED", "ACTOR.RUN.ABORTED", "ACTOR.RUN.TIMED_OUT"]. It should be used directly in the eventTypes field when creating webhooks for future-proofing.

Applied to files:

  • components/apify/actions/run-actor/run-actor.mjs
🧬 Code graph analysis (1)
components/apify/actions/scrape-single-url/scrape-single-url.mjs (2)
components/apify/common/constants.mjs (2)
  • WCC_ACTOR_ID (1-1)
  • WCC_ACTOR_ID (1-1)
components/apify/actions/run-task-synchronously/run-task-synchronously.mjs (1)
  • defaultDatasetId (95-104)
🔇 Additional comments (11)
components/apify/actions/scrape-single-url/scrape-single-url.mjs (5)

2-3: Imports look correct and necessary.
WCC actor ID and job status enum are used appropriately.


9-9: Version bump acknowledged.
0.1.1 reflects behavior change (waits for run and returns dataset item).


17-18: Requiring URL is correct.
Prevents accidental empty runs.


19-41: Default crawlerType added — resolves prior feedback.
Matches the “Stealthy web browser (default)” doc text.


64-67: Status gating on SUCCEEDED is solid.
Including consoleUrl in the error is helpful.

components/apify/actions/run-actor/run-actor.mjs (6)

10-10: Version bump looks good.


341-345: Corrected sync call shape (input/options) — good fix.


285-292: Accept tag or build ID when validating build.

Currently only tags pass validation. Allow build IDs too to avoid false negatives.

Apply:

-    if (buildTag) {
-      const taggedBuilds = actorDetails.taggedBuilds || {};
-      if (!taggedBuilds[buildTag]) {
-        throw new Error(
-          `Build with tag "${buildTag}" was not found for actor "${actorDetails.title || actorDetails.name}".`,
-        );
-      }
-    }
+    if (buildTag) {
+      const taggedBuilds = actorDetails.taggedBuilds || {};
+      const isKnownTag = Boolean(taggedBuilds[buildTag]);
+      const isKnownId = Object.values(taggedBuilds).some((b) => b.buildId === buildTag);
+      if (!isKnownTag && !isKnownId) {
+        throw new Error(
+          `Build with tag or ID "${buildTag}" was not found for actor "${actorDetails.title || actorDetails.name}".`,
+        );
+      }
+    }
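
The tag-or-ID check can be isolated as a small predicate. This is a sketch under the assumption, taken from the diff above, that `taggedBuilds` maps each tag to an object with a `buildId` field; `isKnownBuildRef` and the sample data are hypothetical.

```javascript
// Hypothetical predicate: true when buildRef matches either a tag name or
// the buildId of any tagged build.
function isKnownBuildRef(taggedBuilds, buildRef) {
  const builds = taggedBuilds || {};
  if (Object.keys(builds).includes(buildRef)) return true;
  return Object.values(builds).some((b) => b && b.buildId === buildRef);
}

// Made-up sample data for illustration only.
const sampleBuilds = { latest: { buildId: "abc123" }, beta: { buildId: "def456" } };
console.log(isKnownBuildRef(sampleBuilds, "latest"));
console.log(isKnownBuildRef(sampleBuilds, "def456"));
```

Like the suggested diff, this only recognizes IDs of builds that currently carry a tag; an untagged build ID would still be rejected.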

102-130: Resolve schema for either build tag or build ID (fallback to latest).

Makes schema resolution robust when users select a tag or supply an ID.

Apply:

-    async getSchema(actorId, buildTag) {
-      const build = await this.apify.getBuild(actorId, buildTag);
-      if (!build) {
-        throw new Error(`No build found for actor ${actorId}`);
-      }
+    async getSchema(actorId, buildRef) {
+      let build;
+      if (buildRef) {
+        const actor = await this.apify.getActor({ actorId });
+        const resolvedBuildId = actor?.taggedBuilds?.[buildRef]?.buildId ?? buildRef;
+        build = await this.apify.getBuild(actorId, resolvedBuildId);
+      } else {
+        build = await this.apify.getBuild(actorId);
+      }
+      if (!build) {
+        throw new Error(`No build found for actor ${actorId} using ref "${buildRef}"`);
+      }

243-250: Event type options source looks correct.

If Apify adds new events, this will auto-include them. Confirm the UI handles a long list gracefully.


331-335: No change required — async API expects data/params.
components/apify/apify.app.mjs defines runActorAsynchronously({ actorId, data, params }) and calls .start(data, params), so the current call (data: input, params) is correct.

Collaborator

@michelle0927 michelle0927 left a comment

LGTM!

@michelle0927 michelle0927 moved this from Changes Required to Ready for QA in Component (Source and Action) Backlog Sep 23, 2025
@vunguyenhung vunguyenhung moved this from Ready for QA to In QA in Component (Source and Action) Backlog Sep 24, 2025
@vunguyenhung vunguyenhung moved this from In QA to Ready for Release in Component (Source and Action) Backlog Sep 24, 2025
@vunguyenhung
Collaborator

Hi everyone, all test cases have passed! Ready for release!

Test report:
https://vunguyenhung.notion.site/APP-Apify-Improvements-to-various-components-273bf548bb5e819682dbd3a86b66100a

@vunguyenhung vunguyenhung merged commit 29b3fc0 into PipedreamHQ:master Sep 24, 2025
10 checks passed
@github-project-automation github-project-automation bot moved this from Ready for Release to Done in Component (Source and Action) Backlog Sep 24, 2025
@matyascimbulka matyascimbulka deleted the feat/apify-upgrades branch September 25, 2025 11:19

Labels

User submitted Submitted by a user

7 participants